HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Model quantization is a widely used technique to compress and accelerate deep
neural network (DNN) inference. Emerging DNN hardware accelerators have begun to
support mixed precision (1-8 bits) to further improve computation
efficiency, which makes it challenging to find the optimal bitwidth for
each layer: domain experts must explore a vast design space, trading
off accuracy, latency, energy, and model size, which is both
time-consuming and sub-optimal. Conventional quantization algorithms ignore the
differences between hardware architectures and quantize all layers uniformly.
In this paper, we introduce the Hardware-Aware Automated Quantization (HAQ)
framework, which leverages reinforcement learning to automatically determine
the quantization policy and takes the hardware accelerator's feedback into the
design loop. Rather than relying on proxy signals such as FLOPs and model size,
we employ a hardware simulator to generate direct feedback signals (latency and
energy) to the RL agent. Compared with conventional methods, our framework is
fully automated and can specialize the quantization policy for different neural
network architectures and hardware architectures. Our framework
reduces latency by 1.4-1.95x and energy consumption by 1.9x with
negligible loss of accuracy compared with fixed-bitwidth (8-bit)
quantization. Our framework reveals that the optimal policies on different
hardware architectures (i.e., edge and cloud architectures) under different
resource constraints (i.e., latency, energy and model size) are drastically
different. We interpret the implications of the different quantization policies,
which offer insights for both neural network architecture design and hardware
architecture design.
Comment: CVPR 2019. The first three authors contributed equally to this work.
Project page: https://hanlab.mit.edu/projects/haq
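To make the idea of a per-layer bitwidth policy concrete, here is a minimal sketch of mixed-precision weight quantization. It assumes a generic symmetric uniform quantizer and a hand-written policy; it is not the HAQ quantizer, and it contains no RL agent or hardware simulator. The names quantize_symmetric, apply_policy, and model_size_bits are hypothetical, and this signed grid needs at least 2 bits, so the 1-bit case from the abstract is not covered.

```python
import numpy as np

def quantize_symmetric(weights, bits):
    """Round a float array onto a signed `bits`-bit uniform grid, then dequantize.

    Generic symmetric quantizer used only for illustration (assumption).
    """
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits for a signed grid")
    qmax = 2 ** (bits - 1) - 1              # e.g. 127 for 8 bits, 7 for 4 bits
    max_abs = float(np.max(np.abs(weights)))
    if max_abs == 0.0:
        return weights.copy()               # nothing to quantize
    scale = max_abs / qmax                  # width of one quantization step
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale

def apply_policy(layers, policy):
    """Quantize each layer with its own bitwidth (the 'mixed precision' part)."""
    return {name: quantize_symmetric(w, policy[name]) for name, w in layers.items()}

def model_size_bits(layers, policy):
    """Weight storage in bits under the policy; one of the constraints named in the abstract."""
    return sum(w.size * policy[name] for name, w in layers.items())

# Hypothetical two-layer model and a hand-picked policy.
rng = np.random.default_rng(0)
layers = {"conv1": rng.normal(size=(8, 8)), "fc": rng.normal(size=(16,))}
policy = {"conv1": 8, "fc": 4}
quantized = apply_policy(layers, policy)
print(model_size_bits(layers, policy), "bits of weight storage")
```

In a search-based setting the policy dictionary would be proposed by the search procedure and scored on accuracy and hardware feedback, rather than written by hand as it is here.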
A Novel Method for the Absolute Pose Problem with Pairwise Constraints
Absolute pose estimation is a fundamental problem in computer vision, and it
is a typical parameter estimation problem, meaning that efforts to solve it
will always suffer from outlier-contaminated data. Conventionally, for a fixed
dimensionality d and the number of measurements N, a robust estimation problem
cannot be solved faster than O(N^d). Furthermore, it is almost impossible to
remove d from the exponent of the runtime of a globally optimal algorithm.
However, absolute pose estimation is a geometric parameter estimation problem,
and thus has special constraints. In this paper, we consider pairwise
constraints and propose a globally optimal algorithm for solving the absolute
pose estimation problem. The proposed algorithm has linear complexity in the
number of correspondences at a given outlier ratio. Concretely, we first
decouple the rotation and translation subproblems by utilizing the pairwise
constraints, and then we solve the rotation subproblem using a
branch-and-bound algorithm. Lastly, we estimate the translation based on the
known rotation by using another branch-and-bound algorithm. The advantages of
our method are demonstrated via thorough testing on both synthetic and
real-world data.
Comment: 10 pages, 7 figures
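The abstract does not spell out the pairwise constraint, so the following is only an illustration of how pairwise differences can decouple rotation from translation. It assumes the simpler 3D-to-3D rigid alignment setting q_i = R p_i + t, which is not necessarily the paper's formulation, and it assumes outlier-free data. A closed-form Kabsch/SVD fit stands in for the branch-and-bound rotation search, and a plain average stands in for the second branch-and-bound step. All function names are hypothetical.

```python
import numpy as np

def pairwise_differences(points):
    """Differences between consecutive points; a shared translation cancels out."""
    return points[1:] - points[:-1]

def rotation_from_differences(dp, dq):
    """Kabsch/SVD fit of R with dq_i ~= R @ dp_i (stand-in for branch-and-bound)."""
    h = dp.T @ dq                                   # 3x3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))          # guard against reflections
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T

def translation_given_rotation(p, q, rot):
    """With R known, t is the mean residual q_i - R @ p_i."""
    return np.mean(q - p @ rot.T, axis=0)

# Synthetic, outlier-free example with a known pose.
theta = 0.3
rot_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0,            0.0,           1.0]])
t_true = np.array([1.0, -2.0, 0.5])
rng = np.random.default_rng(1)
p = rng.normal(size=(10, 3))
q = p @ rot_true.T + t_true

dp, dq = pairwise_differences(p), pairwise_differences(q)   # translation-free
rot = rotation_from_differences(dp, dq)
t = translation_given_rotation(p, q, rot)
```

Because q_i - q_j = R (p_i - p_j), the rotation can be estimated without knowing t, which is the decoupling idea in miniature. Handling outliers with a global optimality guarantee is what this sketch leaves out.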
Bi-directional Weakly Supervised Knowledge Distillation for Whole Slide Image Classification
Computer-aided pathology diagnosis based on the classification of Whole Slide
Image (WSI) plays an important role in clinical practice, and it is often
formulated as a weakly-supervised Multiple Instance Learning (MIL) problem.
Existing methods solve this problem from either a bag classification or an
instance classification perspective. In this paper, we propose an end-to-end
weakly supervised knowledge distillation framework (WENO) for WSI
classification, which integrates a bag classifier and an instance classifier in
a knowledge distillation framework to mutually improve the performance of both
classifiers. Specifically, an attention-based bag classifier is used as the
teacher network, which is trained with weak bag labels, and an instance
classifier is used as the student network, which is trained using the
normalized attention scores obtained from the teacher network as soft pseudo
labels for the instances in positive bags. An instance feature extractor is
shared between the teacher and the student to further enhance the knowledge
exchange between them. In addition, we propose a hard positive instance mining
strategy based on the output of the student network to force the teacher
network to keep mining hard positive instances. WENO is a plug-and-play
framework that can be easily applied to any existing attention-based bag
classification method. Extensive experiments on five datasets demonstrate the
efficiency of WENO. Code is available at https://github.com/miccaiif/WENO.
Comment: Accepted by NeurIPS 202
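The teacher-to-student step described above, where normalized attention scores become soft pseudo labels, can be sketched as follows. This is not the WENO implementation (see the repository linked above for that). It assumes a linear attention scorer in place of a learned attention network, min-max normalization as the normalization (the abstract does not say which is used), and the standard MIL convention that every instance in a negative bag is negative. The names attention_pool and soft_pseudo_labels are hypothetical.

```python
import numpy as np

def attention_pool(features, w):
    """Attention-based bag pooling.

    features: (n_instances, dim) instance features of one bag.
    w: (dim,) hypothetical attention parameters (a linear scorer, for brevity).
    Returns the bag embedding and the per-instance attention weights.
    """
    scores = features @ w
    scores = scores - scores.max()                  # numerically stable softmax
    attn = np.exp(scores) / np.exp(scores).sum()
    bag_embedding = attn @ features                 # attention-weighted average
    return bag_embedding, attn

def soft_pseudo_labels(attn, bag_label):
    """Turn teacher attention into instance-level soft labels for the student."""
    if bag_label == 0:
        return np.zeros_like(attn)                  # negative bag: all instances negative
    span = attn.max() - attn.min()
    if span == 0:
        return np.full_like(attn, 0.5)              # arbitrary fallback (assumption)
    return (attn - attn.min()) / span               # min-max normalization (assumption)

# One hypothetical positive bag with 6 instances and 4-dimensional features.
rng = np.random.default_rng(2)
feats = rng.normal(size=(6, 4))
w = rng.normal(size=4)
bag_embedding, attn = attention_pool(feats, w)
labels = soft_pseudo_labels(attn, bag_label=1)
```

A student instance classifier would then be trained against these soft labels; the shared feature extractor and the hard positive instance mining described in the abstract are omitted here.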